Introduction

As blockchain technology and decentralized finance (DeFi) continue to evolve, the number of cryptocurrency transactions is growing at an incredible pace. With so much activity happening on networks like Ethereum, understanding how tokens interact and identifying transaction patterns can offer valuable insights into market trends, unusual behaviors, and smarter investment strategies.

In this study, we dive into Ethereum transaction data using the Etherscan API to track how wallet addresses interact with different tokens. We capture details like transaction timestamps, buy/sell activity, and gas fees to get a clearer picture of trading behaviors. To uncover hidden patterns, we use association rule mining, applying the Apriori and Eclat algorithms to find tokens that frequently appear together in transactions. This helps us identify common trading pairs, token correlations, and emerging trends in the market.

To make sense of these patterns, we use visualizations like item frequency plots, network graphs, and interactive charts. These tools help illustrate how wallet addresses and tokens are connected, making it easier to spot trends and clusters in transaction activity.

By mapping out these relationships, this research provides valuable insights into how cryptocurrencies are traded, helping both investors and researchers make smarter, data-driven decisions. Understanding these patterns could also improve portfolio diversification strategies and risk assessment in the fast-moving crypto world.

Data

The dataset for this analysis was extracted with the Etherscan API and covers only ERC-20 tokens, i.e. tokens issued on the Ethereum blockchain. It consists of the following features:

col_1 <- c("Wallet.Address", "Token.Name", "Token.Symbol", "Token.Contract.Address", "Amount" )
col_2 <- c("Gas.Price.ETH.", "Buy.Sell", "From", "To", "Timestamp")

df_col <- data.frame(
  "col1_" = col_1,
  "col2_" = col_2
)

df_col %>%
  kable(align = "c", 
        col.names = c("", ""), 
        caption = "Dataset Features", escape = FALSE) %>%
  kable_styling("basic", full_width = F, position = "center")
Dataset Features
Wallet.Address Gas.Price.ETH.
Token.Name Buy.Sell
Token.Symbol From
Token.Contract.Address To
Amount Timestamp

Wallet.Address is the buyer's wallet ID. Token.Name is the name of the token bought or sold, and Token.Symbol is its ticker. Token.Contract.Address is the contract ID of that token, and Amount is how much of that token the trader bought. The gas price column (Gas.Price..ETH. in the raw data) is the transaction cost in Ethereum units. From and To are the sending and receiving addresses; they show whether the trader is buying or selling, which is how the Buy.Sell column is created. Timestamp records when the transaction took place. Below is a view of the first row of the dataset:

df <- read.csv("./data/ethereum_transactions.csv")

df$Wallet.Address <- format(df$Wallet.Address, scientific = FALSE)
df$Token.Contract.Address <- format(df$Token.Contract.Address, scientific = FALSE)
df$From <- format(df$From, scientific = FALSE)
df$To <- format(df$To, scientific = FALSE)

t(head(df,1)) %>%
  kable(align = "c", 
        col.names = c("", ""),
        caption = "Dataset Features", escape = FALSE) %>%
  kable_styling("responsive", full_width = F, position = "center")
Dataset Features
Wallet.Address 875608240288507300257768961551696197779443417088
Token.Name bZx Protocol Token
Token.Symbol BZRX
Token.Contract.Address 495791651057932287938169074693515556992560660480
Amount 1355.777
Gas.Price..ETH. 29
Buy.Sell buy
From 1060251556556117726856976116943140309024399949824
To 875608240288507300257768961551696197779443417088
Timestamp 2020-11-13 20:25:09
# Data Preprocessing

df <- df %>%
  filter(!is.na(Token.Symbol)) %>%
  select(Wallet.Address, Token.Symbol, Buy.Sell, Timestamp) %>%
  arrange(Timestamp)

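# Build one basket of distinct tokens per wallet.
# unique() guards against repeated purchases of the same token,
# which the "transactions" coercion does not accept.
df_transactions <- df %>%
  group_by(Wallet.Address) %>%
  summarise(tokens = list(unique(Token.Symbol))) %>%
  ungroup()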

For data preprocessing, we remove missing values and consider only Buy transactions. We keep only the Wallet.Address and Token.Symbol columns, as these are the most relevant for our association analysis, which leaves a dataset of dimension 2625x2. We then group the dataset by Wallet.Address and collect each wallet's tokens into a list, a necessary step for creating the transaction matrix.
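
A minimal base-R sketch of how the buy-only filter and the per-wallet grouping could look. The toy table, its wallet IDs, and the assumption that Buy.Sell holds the values "buy" and "sell" are illustrative only and are not taken from the real dataset.

```r
# Toy stand-in for the real data; wallet IDs and rows are made up.
toy <- data.frame(
  Wallet.Address = c("w1", "w1", "w1", "w2", "w2", "w3"),
  Token.Symbol   = c("AERO", "AERO", "SIF", "AERO", "SHIBY", NA),
  Buy.Sell       = c("buy", "buy", "sell", "buy", "buy", "buy"),
  stringsAsFactors = FALSE
)

# Keep buy rows that have a known token symbol.
buys <- toy[!is.na(toy$Token.Symbol) & toy$Buy.Sell == "buy", ]

# One basket per wallet, with repeated purchases collapsed to a single item.
baskets <- lapply(split(buys$Token.Symbol, buys$Wallet.Address), unique)
baskets
```

Wallet w3 drops out because its only row has no token symbol, and w1 ends up with a single item because its SIF row is a sell.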

Methodology

This study employs association rule mining techniques to analyze Ethereum transaction patterns, focusing on two key algorithms: Eclat and Apriori. Both algorithms aim to uncover frequent token associations within Ethereum transactions, helping to identify trading patterns and relationships between different cryptocurrencies. This part is written by AI. I hope this is not an issue.

Eclat Algorithm

The Eclat (Equivalence Class Clustering and bottom-up Lattice Traversal) algorithm is a depth-first search method used for mining frequent itemsets. Unlike Apriori, which generates candidate itemsets level by level, Eclat represents transactions in a vertical format, allowing for faster computation when working with large datasets.

The process begins by organizing transaction data into sets of token symbols associated with each wallet address. Using a predefined minimum support threshold, the algorithm scans through the dataset to find frequently occurring token combinations. These frequent itemsets represent groups of tokens that often appear together in Ethereum transactions.
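
The vertical layout can be illustrated with a small base-R sketch. This is not how the arules implementation is written; it only shows the idea of storing, for each token, the list of wallets that hold it, and obtaining the support of an itemset by intersecting those lists. The baskets and wallet names are hypothetical.

```r
# Hypothetical baskets: one character vector of tokens per wallet.
baskets <- list(
  w1 = c("A", "B", "C"),
  w2 = c("A", "B"),
  w3 = c("A", "C"),
  w4 = c("B", "C")
)

# Vertical layout: for each token, the IDs of the wallets that hold it.
tokens   <- unlist(baskets, use.names = FALSE)
wallets  <- rep(names(baskets), lengths(baskets))
tidlists <- split(wallets, tokens)

# Support of {A, B} = size of the intersection of the two ID lists,
# divided by the number of wallets.
ab <- intersect(tidlists$A, tidlists$B)
support_ab <- length(ab) / length(baskets)
support_ab
```

Larger itemsets are handled the same way, by intersecting the ID list of a frequent itemset with the ID list of one more token.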

Once the frequent itemsets are identified, the rule induction process extracts association rules. These rules are ranked based on key metrics such as support (the frequency of occurrence in the dataset), confidence (the probability that one token is traded when another token appears), and lift (the strength of the relationship between tokens beyond random chance). The results are visualized using grouped plots and scatter plots to illustrate the relationships between tokens in Ethereum transactions.
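
The three metrics can be computed by hand for a rule lhs => rhs. The sketch below uses hypothetical baskets and helper names (`support`, `rule_metrics`) that are not part of arules; it is meant only to make the definitions concrete.

```r
# Hypothetical baskets: one character vector of tokens per wallet.
baskets <- list(
  c("A", "B", "C"),
  c("A", "B"),
  c("A", "C"),
  c("B", "C"),
  c("A", "B", "C")
)

# Share of baskets that contain every item in `items`.
support <- function(items, baskets) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

# Support, confidence, and lift for the rule lhs => rhs.
rule_metrics <- function(lhs, rhs, baskets) {
  supp <- support(c(lhs, rhs), baskets)   # P(lhs and rhs)
  conf <- supp / support(lhs, baskets)    # P(rhs | lhs)
  lift <- conf / support(rhs, baskets)    # P(rhs | lhs) / P(rhs)
  c(support = supp, confidence = conf, lift = lift)
}

rule_metrics("A", "B", baskets)
```

Here A and B appear together in 3 of 5 baskets (support 0.6), A appears in 4 of 5 (confidence 0.75), and B appears in 4 of 5, so the lift is 0.9375: slightly below 1, meaning A and B co-occur a little less often than independence would predict.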

Apriori Algorithm

The Apriori algorithm follows a different approach to association rule mining by employing a breadth-first search strategy. It systematically generates candidate itemsets, pruning those that do not meet the minimum support threshold at each iteration. This method significantly reduces computational complexity by eliminating infrequent itemsets early in the process.

For this analysis, the Apriori algorithm was applied to the Ethereum transaction dataset to generate association rules between tokens. The algorithm first identifies individual tokens with high support, then iteratively expands these into larger itemsets while ensuring that each subset remains frequent.
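
The level-by-level search can be sketched in base R as follows. This is a simplified illustration of the join, prune, and count steps on hypothetical baskets, not the arules implementation; the helper names `count_of` and `key` and the threshold `min_count` are assumptions made for the example.

```r
# Hypothetical baskets; the real analysis uses one basket per wallet.
baskets <- list(
  c("A", "B", "C"), c("A", "B"), c("A", "C"), c("B", "C"), c("A", "B", "C")
)
min_count <- 3  # minimum number of baskets an itemset must appear in

# Number of baskets that contain every item in `items`.
count_of <- function(items) {
  sum(vapply(baskets, function(b) all(items %in% b), logical(1)))
}
key <- function(items) paste(items, collapse = ",")

# Level 1: frequent single tokens.
frequent <- Filter(function(s) count_of(s) >= min_count,
                   as.list(sort(unique(unlist(baskets)))))
all_frequent <- frequent

while (length(frequent) > 1) {
  k <- length(frequent[[1]]) + 1
  known <- vapply(frequent, key, character(1))

  # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
  candidates <- list()
  for (pair in combn(length(frequent), 2, simplify = FALSE)) {
    merged <- sort(union(frequent[[pair[1]]], frequent[[pair[2]]]))
    if (length(merged) == k) candidates[[key(merged)]] <- merged
  }

  # Prune step: drop candidates that have an infrequent (k-1)-subset.
  candidates <- Filter(function(cand) {
    all(vapply(seq_along(cand), function(d) key(cand[-d]) %in% known, logical(1)))
  }, candidates)

  # Count step: keep candidates that reach the minimum count.
  frequent <- unname(Filter(function(s) count_of(s) >= min_count, candidates))
  all_frequent <- c(all_frequent, frequent)
}

vapply(all_frequent, key, character(1))
```

With these baskets every single token and every pair is frequent, while {A, B, C} appears in only two baskets and is dropped at the third level.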

The resulting association rules are evaluated based on support, confidence, and lift to determine their significance. Visualization techniques, such as network graphs, were used to represent the discovered token relationships, providing insights into how cryptocurrencies are commonly traded together.

Comparison and Insights

Both algorithms are used to extract insights into Ethereum token trading behaviors. Eclat is often more efficient on large datasets because its vertical layout lets it compute support by intersecting transaction-ID lists instead of rescanning the data, while Apriori's level-by-level search provides a more structured and interpretable way of generating association rules. The results help identify key trading pairs, frequently associated tokens, and potential market patterns, which can be useful for traders and analysts studying blockchain transaction data.

Initial Analysis

The transaction matrix is the main component of the whole analysis. It puts the data in a format suitable for mining frequent itemsets and association rules. The transaction matrix summary is shown below:

# Transaction Matrix
transactions <- as(df_transactions$tokens, "transactions")
summary(transactions)
## transactions as itemMatrix in sparse format with
##  17 rows (elements/itemsets/transactions) and
##  1191 columns (items) and a density of 0.1296488 
## 
## most frequent items:
##    TESLA AI      ASK AI DEEPSEEK AI DEEPSEEK R1   NVIDIA AI     (Other) 
##          14          13          13          13          13        2559 
## 
## element (itemset/transaction) length distribution:
## sizes
##  10  21  33  50  95 105 116 141 151 155 166 178 230 265 318 440 
##   1   1   1   1   1   1   1   1   2   1   1   1   1   1   1   1 
## 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    10.0    95.0   151.0   154.4   178.0   440.0 
## 
## includes extended item information - examples:
##             labels
## 1 # neiro-coin.org
## 2   # neiroeth.net
## 3 # yawnsworld.org
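
The density reported in the summary can be cross-checked with a little arithmetic: density times rows times columns gives the number of filled cells, which should match the 2625 wallet-token pairs from preprocessing. The variable names below are chosen for this sketch only.

```r
n_wallets <- 17         # rows of the transaction matrix
n_tokens  <- 1191       # columns (distinct items)
density   <- 0.1296488  # as reported by summary(transactions)

# Number of filled cells, i.e. wallet-token pairs.
filled <- density * n_wallets * n_tokens
round(filled)                 # 2625

# Average basket size, matching the Mean in the length distribution.
round(filled / n_wallets, 1)  # 154.4
```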

Furthermore, the item frequency is visualized below. The frequencies of the top 20 tokens are quite similar, with TESLA AI being the most bought token.

png("images/itemFrequencyPlot.png")
itemFrequencyPlot(transactions, topN=20, type="absolute", main="Items Frequency") 
dev.off()
itemFrequencyPlot_img <- rasterGrob(readPNG("images/itemFrequencyPlot.png"), width = unit(1, "npc"), height = unit(1, "npc"))

grid.arrange(itemFrequencyPlot_img, ncol = 1)
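
The absolute frequencies behind such a plot are simply counts of how many baskets contain each token. A base-R sketch on hypothetical baskets (not the real data):

```r
# Hypothetical baskets: one character vector of distinct tokens per wallet.
baskets <- list(w1 = c("A", "B", "C"), w2 = c("A", "B"), w3 = c("A"))

# Absolute item frequency: number of baskets that contain each token.
counts <- sort(table(unlist(baskets)), decreasing = TRUE)
counts

# Relative item frequency: share of baskets that contain each token.
counts / length(baskets)
```

Because each basket lists a token at most once, counting occurrences across all baskets is the same as counting baskets.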

Eclat Algorithm

The Eclat algorithm is a widely used method for mining frequent itemsets, from which association rules can then be generated. Here it is applied to ERC-20 tokens to uncover relationships and patterns between the tokens in the dataset. Eclat is run with a minimum support of 0.5 and a minimum itemset length of 5, and rule induction is run with a minimum confidence of 0.1. The analysis of the results shown below is as follows:

The first table lists the frequent itemsets found by Eclat for this dataset. The top 10 itemsets have similar support (the proportion of transactions that contain a particular combination of tokens). The most frequent combination is {ASK AI, DEEPSEEK AI, DEEPSEEK R1, SAKANA AI, TESLA AI}, with a support of 76.47% (13 of 17 wallets).

The second table is sorted by support. The LHS can be thought of as the condition and the RHS as the conclusion. The first rule has a lift of 1.31: wallets that bought {DEEPSEEK AI, DEEPSEEK R1, SAKANA AI, TESLA AI} are about 1.3 times as likely to have bought {ASK AI} as they would be if the two were independent. The support of the top 10 rules is high, above 70% in every case. The top 5 rules share the same support because they are all derived from the same five-token itemset (the Itemset column is identical), with a different token moved to the RHS each time. These tokens also have similar individual frequencies.

The third table is sorted by confidence. The confidence is 1 for all of the top 10 rules, which means that in this dataset every wallet that contains the LHS also contains the RHS.

The fourth table is sorted by lift, which measures the strength of the association between the LHS and the RHS. The lift is the same (1.89) for all of the top 10 rules, which all have {SIF} as the RHS and differ only slightly in the LHS basket. A trader who has bought the LHS tokens is therefore considerably more likely than the average trader to have bought SIF as well.
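
The lift values in the support-sorted table can be reproduced by hand from the item counts in the transaction summary (TESLA AI in 14 wallets, ASK AI in 13, out of 17), since lift is the confidence divided by the support of the RHS. The helper `lift` below is defined only for this check.

```r
n_wallets <- 17
# Item counts taken from the summary(transactions) output.
count <- c("TESLA AI" = 14, "ASK AI" = 13)

# lift = confidence / support(RHS)
lift <- function(confidence, rhs) confidence / (count[[rhs]] / n_wallets)

round(lift(1, "ASK AI"), 6)    # 1.307692
round(lift(1, "TESLA AI"), 6)  # 1.214286
```

This also explains why the fifth rule in that table has a lower lift than the first four despite identical support and confidence: its RHS, TESLA AI, is the more common token.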

The plot below suggests that higher support tends to go with higher lift. It also shows that FARTCOIN and LUMA AI are among the coins that are most frequently bought together.

eclat_results <- eclat(transactions, parameter = list(support = 0.5, minlen = 5))
write.csv(inspect(head(sort(eclat_results, by = "support"), 10)), "./data/eclat_inspect.csv")
eclat_inspect <- read.csv("./data/eclat_inspect.csv")

eclat_inspect %>%
  kable(align = "c", 
        col.names = c("", "Items", "Support", "Count"),
        caption = "Eclat Inspections", escape = FALSE) %>%
  kable_styling("responsive", full_width = F, position = "center")
Eclat Inspections
Items Support Count
[1] {ASK AI, DEEPSEEK AI, DEEPSEEK R1, SAKANA AI, TESLA AI} 0.7647059 13
[2] {ASK AI, DEEPSEEK AI, DEEPSEEK R1, NANSEN AI, SAKANA AI, TESLA AI} 0.7058824 12
[3] {DEEPSEEK AI, DEEPSEEK R1, NANSEN AI, SAKANA AI, TESLA AI} 0.7058824 12
[4] {ASK AI, DEEPSEEK AI, DEEPSEEK R1, NANSEN AI, SAKANA AI} 0.7058824 12
[5] {ASK AI, DEEPSEEK R1, NANSEN AI, SAKANA AI, TESLA AI} 0.7058824 12
[6] {ASK AI, DEEPSEEK AI, NANSEN AI, SAKANA AI, TESLA AI} 0.7058824 12
[7] {ASK AI, DEEPSEEK AI, DEEPSEEK R1, NANSEN AI, TESLA AI} 0.7058824 12
[8] {AERO, ASK AI, DEEPSEEK AI, DEEPSEEK R1, FISH AI, MASSIVE AI, NVIDIA AI, SAKANA AI, TESLA AI} 0.7058824 12
[9] {AERO, DEEPSEEK AI, DEEPSEEK R1, FISH AI, MASSIVE AI, NVIDIA AI, SAKANA AI, TESLA AI} 0.7058824 12
[10] {AERO, ASK AI, DEEPSEEK AI, DEEPSEEK R1, FISH AI, MASSIVE AI, NVIDIA AI, SAKANA AI} 0.7058824 12
# Frequent Rules
freq_rules_eclat <- ruleInduction(eclat_results, transactions, confidence=0.1)
write.csv(as(freq_rules_eclat, "data.frame"), "./data/freq_rules_eclat.csv")

freq_inspect_support <- inspect(head(sort(freq_rules_eclat, by = "support"), 10))
freq_inspect_confidence <- inspect(head(sort(freq_rules_eclat, by = "confidence"), 10))
freq_inspect_lift <- inspect(head(sort(freq_rules_eclat, by = "lift"), 10))

write.csv(freq_inspect_support, "./data/freq_inspect_support.csv")
write.csv(freq_inspect_confidence, "./data/freq_inspect_confidence.csv")
write.csv(freq_inspect_lift, "./data/freq_inspect_lift.csv")

freq_eclat_rules <- read.csv("./data/freq_rules_eclat.csv")

freq_rules_eclat_goruped <- plot(freq_rules_eclat, method="grouped")
ggsave(filename = "images/freq_rules_eclat_goruped.png", plot = freq_rules_eclat_goruped, width = 6, height = 6)

Frequent Rules

By Support

freq_eclat_inspect <- read.csv("./data/freq_inspect_support.csv")

freq_eclat_inspect %>%
  kable(align = "c", 
        col.names = c("", "LHS", "", "RHS", "Support", "Confidence", "Lift", "Itemset"),
        caption = "Eclat Frequency Inspection", escape = FALSE) %>%
  kable_styling("responsive", full_width = F, position = "center")
Eclat Frequency Inspection
LHS RHS Support Confidence Lift Itemset
[1] {DEEPSEEK AI, DEEPSEEK R1, SAKANA AI, TESLA AI} => {ASK AI} 0.7647059 1.0000000 1.307692 660923
[2] {ASK AI, DEEPSEEK R1, SAKANA AI, TESLA AI} => {DEEPSEEK AI} 0.7647059 1.0000000 1.307692 660923
[3] {ASK AI, DEEPSEEK AI, SAKANA AI, TESLA AI} => {DEEPSEEK R1} 0.7647059 1.0000000 1.307692 660923
[4] {ASK AI, DEEPSEEK AI, DEEPSEEK R1, TESLA AI} => {SAKANA AI} 0.7647059 1.0000000 1.307692 660923
[5] {ASK AI, DEEPSEEK AI, DEEPSEEK R1, SAKANA AI} => {TESLA AI} 0.7647059 1.0000000 1.214286 660923
[6] {DEEPSEEK AI, DEEPSEEK R1, NANSEN AI, SAKANA AI, TESLA AI} => {ASK AI} 0.7058824 1.0000000 1.307692 631903
[7] {ASK AI, DEEPSEEK R1, NANSEN AI, SAKANA AI, TESLA AI} => {DEEPSEEK AI} 0.7058824 1.0000000 1.307692 631903
[8] {ASK AI, DEEPSEEK AI, NANSEN AI, SAKANA AI, TESLA AI} => {DEEPSEEK R1} 0.7058824 1.0000000 1.307692 631903
[9] {ASK AI, DEEPSEEK AI, DEEPSEEK R1, SAKANA AI, TESLA AI} => {NANSEN AI} 0.7058824 0.9230769 1.307692 631903
[10] {ASK AI, DEEPSEEK AI, DEEPSEEK R1, NANSEN AI, TESLA AI} => {SAKANA AI} 0.7058824 1.0000000 1.307692 631903

By Confidence

freq_eclat_inspect <- read.csv("./data/freq_inspect_confidence.csv")

freq_eclat_inspect %>%
  kable(align = "c", 
        col.names = c("", "LHS", "", "RHS", "Support", "Confidence", "Lift", "Itemset"),
        caption = "Eclat Frequency Inspection", escape = FALSE) %>%
  kable_styling("responsive", full_width = F, position = "center")
Eclat Frequency Inspection
LHS RHS Support Confidence Lift Itemset
[1] {FISH AI, Fridon AI, MASSIVE AI, NANSEN AI, NVIDIA AI, SCALE AI, SHIBY, SIF, TESLA AI} => {AERO} 0.5294118 1 1.416667 1
[2] {AERO, Fridon AI, MASSIVE AI, NANSEN AI, NVIDIA AI, SCALE AI, SHIBY, SIF, TESLA AI} => {FISH AI} 0.5294118 1 1.416667 1
[3] {AERO, FISH AI, MASSIVE AI, NANSEN AI, NVIDIA AI, SCALE AI, SHIBY, SIF, TESLA AI} => {Fridon AI} 0.5294118 1 1.545454 1
[4] {AERO, FISH AI, Fridon AI, NANSEN AI, NVIDIA AI, SCALE AI, SHIBY, SIF, TESLA AI} => {MASSIVE AI} 0.5294118 1 1.416667 1
[5] {AERO, FISH AI, Fridon AI, MASSIVE AI, NVIDIA AI, SCALE AI, SHIBY, SIF, TESLA AI} => {NANSEN AI} 0.5294118 1 1.416667 1
[6] {AERO, FISH AI, Fridon AI, MASSIVE AI, NANSEN AI, SCALE AI, SHIBY, SIF, TESLA AI} => {NVIDIA AI} 0.5294118 1 1.307692 1
[7] {AERO, FISH AI, Fridon AI, MASSIVE AI, NANSEN AI, NVIDIA AI, SHIBY, SIF, TESLA AI} => {SCALE AI} 0.5294118 1 1.545454 1
[8] {AERO, FISH AI, Fridon AI, MASSIVE AI, NANSEN AI, NVIDIA AI, SCALE AI, SIF, TESLA AI} => {SHIBY} 0.5294118 1 1.545454 1
[9] {AERO, FISH AI, Fridon AI, MASSIVE AI, NANSEN AI, NVIDIA AI, SCALE AI, SHIBY, TESLA AI} => {SIF} 0.5294118 1 1.888889 1
[10] {AERO, FISH AI, Fridon AI, MASSIVE AI, NANSEN AI, NVIDIA AI, SCALE AI, SHIBY, SIF} => {TESLA AI} 0.5294118 1 1.214286 1

By Lift

freq_eclat_inspect <- read.csv("./data/freq_inspect_lift.csv")

freq_eclat_inspect %>%
  kable(align = "c", 
        col.names = c("", "LHS", "", "RHS", "Support", "Confidence", "Lift", "Itemset"),
        caption = "Eclat Frequency Inspection", escape = FALSE) %>%
  kable_styling("responsive", full_width = F, position = "center")
Eclat Frequency Inspection
LHS RHS Support Confidence Lift Itemset
[1] {AERO, FISH AI, Fridon AI, MASSIVE AI, NANSEN AI, NVIDIA AI, SCALE AI, SHIBY, TESLA AI} => {SIF} 0.5294118 1 1.888889 1
[2] {AERO, ASK AI, FISH AI, Fridon AI, MASSIVE AI, NANSEN AI, NVIDIA AI, SCALE AI, SHIBY} => {SIF} 0.5294118 1 1.888889 2
[3] {AERO, DEEPSEEK AI, FISH AI, Fridon AI, MASSIVE AI, NANSEN AI, NVIDIA AI, SCALE AI, SHIBY} => {SIF} 0.5294118 1 1.888889 3
[4] {AERO, DEEPSEEK R1, FISH AI, Fridon AI, MASSIVE AI, NANSEN AI, NVIDIA AI, SCALE AI, SHIBY} => {SIF} 0.5294118 1 1.888889 4
[5] {AERO, FISH AI, Fridon AI, MASSIVE AI, NANSEN AI, NVIDIA AI, SAKANA AI, SCALE AI, SHIBY} => {SIF} 0.5294118 1 1.888889 5
[6] {AERO, FISH AI, Fridon AI, MASSIVE AI, NANSEN AI, SAKANA AI, SCALE AI, SHIBY, TESLA AI} => {SIF} 0.5294118 1 1.888889 6
[7] {AERO, ASK AI, FISH AI, Fridon AI, MASSIVE AI, NANSEN AI, SAKANA AI, SCALE AI, SHIBY} => {SIF} 0.5294118 1 1.888889 7
[8] {AERO, DEEPSEEK AI, FISH AI, Fridon AI, MASSIVE AI, NANSEN AI, SAKANA AI, SCALE AI, SHIBY} => {SIF} 0.5294118 1 1.888889 8
[9] {AERO, DEEPSEEK R1, FISH AI, Fridon AI, MASSIVE AI, NANSEN AI, SAKANA AI, SCALE AI, SHIBY} => {SIF} 0.5294118 1 1.888889 9
[10] {AERO, DEEPSEEK R1, FISH AI, Fridon AI, MASSIVE AI, NANSEN AI, SCALE AI, SHIBY, TESLA AI} => {SIF} 0.5294118 1 1.888889 10

Visualisations - Eclat

Grouped Matrix by Lift & Support

itemFrequencyPlot_img_1 <- rasterGrob(readPNG("images/freq_rules_eclat_goruped.png"), width = unit(1, "npc"), height = unit(1, "npc"))

grid.arrange(itemFrequencyPlot_img_1, ncol = 1)

Apriori Algorithm

The Apriori algorithm is used here to analyse the rules that lead to specific items. I choose the target items using the item frequency plot: TESLA AI, AERO, Fridon AI, and AEROBUD are set as the RHS because each is the first item at its frequency level. A minimum support of 0.5 and a minimum confidence of 0.8 are used. My analysis of the results presented below is as follows.

TESLA AI appears in 82.35% of wallets (the rule with an empty LHS). It appears together with NVIDIA AI in 76.47% of wallets; that rule has a confidence of 1 and a lift of 1.21, so holders of NVIDIA AI are about 21% more likely than the average wallet to hold TESLA AI. AERO is bought together with MASSIVE AI in 70.59% of wallets, Fridon AI with SCALE AI in 64.71% of wallets, and AEROBUD with NANSEN AI in 58.82% of wallets.
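
To make the columns of the Apriori tables concrete, the rule {TESLA AI} => {AERO} can be recomputed from wallet counts read off those tables: 17 wallets in total, 14 with TESLA AI, 12 with AERO, and 12 with both. The variable names are chosen for this sketch.

```r
n      <- 17  # wallets in total
n_lhs  <- 14  # wallets holding TESLA AI
n_rhs  <- 12  # wallets holding AERO
n_both <- 12  # wallets holding both

support    <- n_both / n            # share of wallets with LHS and RHS
coverage   <- n_lhs / n             # share of wallets with the LHS
confidence <- support / coverage    # share of LHS wallets that also hold the RHS
lift       <- confidence / (n_rhs / n)

round(c(support = support, coverage = coverage,
        confidence = confidence, lift = lift), 7)
# support 0.7058824, coverage 0.8235294, confidence 0.8571429, lift 1.2142857
```

Support is therefore a joint share of all wallets, while confidence is the conditional share among wallets that already hold the LHS.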

apriori_results <- apriori(transactions, parameter = list(support = 0.5, confidence = 0.8), appearance=list(default="lhs", rhs="TESLA AI"))
write.csv(inspect(head(sort(apriori_results, by = "support"), 10)), "./data/apriori_results_TESLA.csv")

# Visualisation - Apriori
png("images/apriori_results_graphs_TESLA.png")
plot(apriori_results, method="graph")
dev.off()

apriori_results <- apriori(transactions, parameter = list(support = 0.5, confidence = 0.8), appearance=list(default="lhs", rhs="AERO"))
write.csv(inspect(head(sort(apriori_results, by = "support"), 10)), "./data/apriori_results_AERO.csv")

# Visualisation - Apriori
png("images/apriori_results_graphs_AERO.png")
plot(apriori_results, method="graph")
dev.off()

apriori_results <- apriori(transactions, parameter = list(support = 0.5, confidence = 0.8), appearance=list(default="lhs", rhs="Fridon AI"))
write.csv(inspect(head(sort(apriori_results, by = "support"), 10)), "./data/apriori_results_Fridon.csv")

# Visualisation - Apriori
png("images/apriori_results_graphs_Fridon.png")
plot(apriori_results, method="graph")
dev.off()

apriori_results <- apriori(transactions, parameter = list(support = 0.5, confidence = 0.8), appearance=list(default="lhs", rhs="AEROBUD"))
write.csv(inspect(head(sort(apriori_results, by = "support"), 10)), "./data/apriori_results_AEROBUD.csv")

# Visualisation - Apriori
png("images/apriori_results_graphs_AEROBUD.png")
plot(apriori_results, method="graph")
dev.off()

Frequency Rules

By TESLA AI

freq_eclat_inspect <- read.csv("./data/apriori_results_TESLA.csv")

freq_eclat_inspect %>%
  kable(align = "c", 
        col.names = c("", "LHS", "", "RHS", "Support", "Confidence", "Coverage", "Lift", "Count"),
        caption = "Apriori Frequency Inspection", escape = FALSE) %>%
  kable_styling("responsive", full_width = F, position = "center")
Apriori Frequency Inspection
LHS RHS Support Confidence Coverage Lift Count
[1] {} => {TESLA AI} 0.8235294 0.8235294 1.0000000 1.000000 14
[2] {NVIDIA AI} => {TESLA AI} 0.7647059 1.0000000 0.7647059 1.214286 13
[3] {DEEPSEEK R1} => {TESLA AI} 0.7647059 1.0000000 0.7647059 1.214286 13
[4] {SAKANA AI} => {TESLA AI} 0.7647059 1.0000000 0.7647059 1.214286 13
[5] {ASK AI} => {TESLA AI} 0.7647059 1.0000000 0.7647059 1.214286 13
[6] {DEEPSEEK AI} => {TESLA AI} 0.7647059 1.0000000 0.7647059 1.214286 13
[7] {DEEPSEEK R1, SAKANA AI} => {TESLA AI} 0.7647059 1.0000000 0.7647059 1.214286 13
[8] {ASK AI, DEEPSEEK R1} => {TESLA AI} 0.7647059 1.0000000 0.7647059 1.214286 13
[9] {DEEPSEEK AI, DEEPSEEK R1} => {TESLA AI} 0.7647059 1.0000000 0.7647059 1.214286 13
[10] {ASK AI, SAKANA AI} => {TESLA AI} 0.7647059 1.0000000 0.7647059 1.214286 13

By AERO

freq_eclat_inspect <- read.csv("./data/apriori_results_AERO.csv")

freq_eclat_inspect %>%
  kable(align = "c", 
        col.names = c("", "LHS", "", "RHS", "Support", "Confidence", "Coverage", "Lift", "Count"),
        caption = "Apriori Frequency Inspection", escape = FALSE) %>%
  kable_styling("responsive", full_width = F, position = "center")
Apriori Frequency Inspection
LHS RHS Support Confidence Coverage Lift Count
[1] {MASSIVE AI} => {AERO} 0.7058824 1.0000000 0.7058824 1.416667 12
[2] {FISH AI} => {AERO} 0.7058824 1.0000000 0.7058824 1.416667 12
[3] {NVIDIA AI} => {AERO} 0.7058824 0.9230769 0.7647059 1.307692 12
[4] {DEEPSEEK R1} => {AERO} 0.7058824 0.9230769 0.7647059 1.307692 12
[5] {DEEPSEEK AI} => {AERO} 0.7058824 0.9230769 0.7647059 1.307692 12
[6] {ASK AI} => {AERO} 0.7058824 0.9230769 0.7647059 1.307692 12
[7] {SAKANA AI} => {AERO} 0.7058824 0.9230769 0.7647059 1.307692 12
[8] {TESLA AI} => {AERO} 0.7058824 0.8571429 0.8235294 1.214286 12
[9] {FISH AI, MASSIVE AI} => {AERO} 0.7058824 1.0000000 0.7058824 1.416667 12
[10] {MASSIVE AI, NVIDIA AI} => {AERO} 0.7058824 1.0000000 0.7058824 1.416667 12

By FRIDON AI

freq_eclat_inspect <- read.csv("./data/apriori_results_Fridon.csv")

freq_eclat_inspect %>%
  kable(align = "c", 
        col.names = c("", "LHS", "", "RHS", "Support", "Confidence", "Coverage", "Lift", "Count"),
        caption = "Apriori Frequency Inspection", escape = FALSE) %>%
  kable_styling("responsive", full_width = F, position = "center")
Apriori Frequency Inspection
LHS RHS Support Confidence Coverage Lift Count
[1] {SCALE AI} => {Fridon AI} 0.6470588 1.0000000 0.6470588 1.545454 11
[2] {AERO} => {Fridon AI} 0.6470588 0.9166667 0.7058824 1.416667 11
[3] {FISH AI} => {Fridon AI} 0.6470588 0.9166667 0.7058824 1.416667 11
[4] {MASSIVE AI} => {Fridon AI} 0.6470588 0.9166667 0.7058824 1.416667 11
[5] {NVIDIA AI} => {Fridon AI} 0.6470588 0.8461538 0.7647059 1.307692 11
[6] {DEEPSEEK AI} => {Fridon AI} 0.6470588 0.8461538 0.7647059 1.307692 11
[7] {DEEPSEEK R1} => {Fridon AI} 0.6470588 0.8461538 0.7647059 1.307692 11
[8] {ASK AI} => {Fridon AI} 0.6470588 0.8461538 0.7647059 1.307692 11
[9] {SAKANA AI} => {Fridon AI} 0.6470588 0.8461538 0.7647059 1.307692 11
[10] {AERO, SCALE AI} => {Fridon AI} 0.6470588 1.0000000 0.6470588 1.545454 11

By AEROBUD

freq_eclat_inspect <- read.csv("./data/apriori_results_AEROBUD.csv")

freq_eclat_inspect %>%
  kable(align = "c", 
        col.names = c("", "LHS", "", "RHS", "Support", "Confidence", "Coverage", "Lift", "Count"),
        caption = "Apriori Frequency Inspection", escape = FALSE) %>%
  kable_styling("responsive", full_width = F, position = "center")
Apriori Frequency Inspection
LHS RHS Support Confidence Coverage Lift Count
[1] {NANSEN AI} => {AEROBUD} 0.5882353 0.8333333 0.7058824 1.416667 10
[2] {MASSIVE AI} => {AEROBUD} 0.5882353 0.8333333 0.7058824 1.416667 10
[3] {FISH AI} => {AEROBUD} 0.5882353 0.8333333 0.7058824 1.416667 10
[4] {AERO} => {AEROBUD} 0.5882353 0.8333333 0.7058824 1.416667 10
[5] {MASSIVE AI, NANSEN AI} => {AEROBUD} 0.5882353 0.9090909 0.6470588 1.545454 10
[6] {FISH AI, NANSEN AI} => {AEROBUD} 0.5882353 0.9090909 0.6470588 1.545454 10
[7] {AERO, NANSEN AI} => {AEROBUD} 0.5882353 0.9090909 0.6470588 1.545454 10
[8] {NANSEN AI, NVIDIA AI} => {AEROBUD} 0.5882353 0.9090909 0.6470588 1.545454 10
[9] {NANSEN AI, SAKANA AI} => {AEROBUD} 0.5882353 0.8333333 0.7058824 1.416667 10
[10] {ASK AI, NANSEN AI} => {AEROBUD} 0.5882353 0.8333333 0.7058824 1.416667 10

Visualisations - Apriori

By TESLA AI

apriori_results_graphs_img <- rasterGrob(readPNG("images/apriori_results_graphs_TESLA.png"), width = unit(1, "npc"), height = unit(1, "npc"))

grid.arrange(apriori_results_graphs_img, ncol = 1)

By AERO

apriori_results_graphs_img <- rasterGrob(readPNG("images/apriori_results_graphs_AERO.png"), width = unit(1, "npc"), height = unit(1, "npc"))

grid.arrange(apriori_results_graphs_img, ncol = 1)

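By FRIDON AI

# The file was saved above as "..._Fridon.png"; the name must match exactly on case-sensitive file systems.
apriori_results_graphs_img <- rasterGrob(readPNG("images/apriori_results_graphs_Fridon.png"), width = unit(1, "npc"), height = unit(1, "npc"))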

grid.arrange(apriori_results_graphs_img, ncol = 1)

By AEROBUD

apriori_results_graphs_img <- rasterGrob(readPNG("images/apriori_results_graphs_AEROBUD.png"), width = unit(1, "npc"), height = unit(1, "npc"))

grid.arrange(apriori_results_graphs_img, ncol = 1)

Summary

In this study, association rule mining was used to identify frequent token combinations and trading patterns within Ethereum transactions. By applying the Eclat and Apriori algorithms, key relationships between tokens were discovered, offering valuable insights into trading behavior. These findings provide a foundation for better decision-making, optimized trading strategies, and a deeper understanding of the Ethereum market.

The most frequently bought token turned out to be TESLA AI, which is usually bought together with other AI tokens. The analysis shows that certain AI-related tokens tend to be held together, with ASK AI, DEEPSEEK AI, TESLA AI, and SAKANA AI often appearing in the same wallets. TESLA AI stands out as a key token, showing up in 82% of wallets, which means its price changes could signal trends for other similar tokens. We also see clusters of assets such as AERO, FISH AI, MASSIVE AI, NANSEN AI, and SHIBY that are closely linked, suggesting that if one moves, the others might follow. Furthermore, AERO and FISH AI appear on the LHS of many high-confidence rules, so their activity could hint at what is coming next in the market. On the other hand, the strong connections between these tokens mean that if one crashes, it could pull the others down with it. This highlights both opportunities and risks for investors: it is worth watching key tokens like TESLA AI and ASK AI for market signals while also diversifying in order to mitigate risk.

Further Applications

As crypto is both my hobby and my passion, this topic sparked ideas for applying these algorithms further. On-chain trading is considered one of the riskiest forms of trading in the financial world; some even label it gambling. However, I believe that with the right tools, what is often seen as “LUCK” can be transformed into a “STRATEGIC ADVANTAGE.” Here is how I envision taking this further. First, I would collect token-level on-chain data such as volume, current gas fees, buy and sell orders, and prices, and analyse it (I am still figuring out the specifics). My algorithm would run 24/7 to collect market data and track the transactions of “crypto whales.” On-chain trading revolves around identifying the “narrative”: what people are currently interested in or are likely to invest in. The goal of my algorithm would be to scan token contract addresses and estimate the probability of investors buying into each token. The algorithm could either stop there or, based on the predicted likelihood and a defined threshold, make a small investment. For the latter, I would establish a take-profit strategy, initially aiming to exit with a 2x return. The algorithm would also aim to make hundreds of trades per day, so that the gains from successful trades outweigh the losses from unsuccessful ones. This concept is still in its early stages and I am refining it; perhaps it will even become my master's thesis. An AI agent that can trade on-chain: wow!